Abstract. Motivated by the hierarchical network model of E. Ravasz,
A.-L. Barabási, and T. Vicsek [3] and [2], we introduce deterministic
scale-free networks derived from a graph-directed self-similar fractal
$\Lambda$. With rigorous mathematical results we verify that our model
captures some of the most important features of many real networks: the
scale-free and the high clustering properties. We also prove that the
diameter is of the order of the logarithm of the size of the system.
Using our (deterministic) fractal $\Lambda$ we generate a random graph
sequence sharing similar properties.



1. Introduction 

In the last two decades there has been a considerable amount of
attention paid to the study of complex networks like the World Wide
Web, social networks, or biological networks. This resulted in the
construction of numerous network models, see e.g. [I], [9], [7], [I], [10],
[5]. Most of them use a version of preferential attachment and are of
probabilistic nature. A completely different approach was initiated by
Barabási, Ravasz, and Vicsek [3]. They introduced deterministic
network models generated by a method which is common in constructing
fractals. Their model exhibits hierarchical structure and the degree
sequence obeys power law decay. To also model the clustering behavior
of real networks, Ravasz and Barabási [2] developed the original model
so that their deterministic network model preserves the same power
law decay and has similar clustering behavior to many real networks.
Namely, the average local clustering coefficient is independent of the
size of the network and the local clustering coefficient decays inversely
proportionally to the degree of the node.

In this paper we generalize both of the models above. Starting from
an arbitrary initial bipartite graph $G$ on $N$ vertices, we construct a
hierarchical sequence of deterministic graphs $G_n$. Namely, $V(G_n)$, the
set of vertices of $G_n$, is $\{0, 1, \dots, N-1\}^n$. To construct $G_n$
from $G_{n-1}$, we take $N$ identical copies of $G_{n-1}$, each of them
identified with a vertex of $G$. Then we connect these components in a
complicated way described in (1). In this way, $G_n$ contains $N^{n-1}$
copies of $G_1$, which are connected in a hierarchical manner, see
Figures 1(a), 1(b) and 3 for two examples.

There are no triangles in $G_n$. Hence, in order to model the clustering
properties of many real networks, we need to extend the set of edges of
our graph sequence to destroy the bipartite property. Motivated by [2],
we add some additional edges to $G_1$ to obtain the (no longer bipartite)
graph $\hat{G}_1$. Then we build up the graph sequence $\hat{G}_n$ as
follows: $\hat{G}_n$ consists of $N^{n-1}$ copies of $\hat{G}_1$, which are
connected to each other in the same way as they were in $G_n$. So $G_n$
and $\hat{G}_n$ have the same vertex set and their edges only differ at the
lowest hierarchical level, that is, within the $N^{n-1}$ copies of $G_1$
and $\hat{G}_1$, see Figures 3 and 4.
We give a rigorous proof of the fact that the average local clustering
coefficient of $\hat{G}_n$ does not depend on the size and the local
clustering coefficient of a node with degree $k$ is of order $1/k$.
The embedding of the adjacency matrix of the graph sequence $G_n$ is
carried out as follows: A vertex $x = (x_1 \dots x_n)$ is identified with
the corresponding $N$-adic interval $I_x$ (see (4)). $\Lambda_n$ is the
union of those $N^{-n} \times N^{-n}$ squares $I_x \times I_y$ for which the
vertices $x, y$ are connected by an edge in $G_n$. So $\Lambda_n$ is the
most straightforward embedding of the adjacency matrix of $G_n$ into the
unit square. The sets $\Lambda_n$ turn out to form a nested sequence of
compact sets, and $\Lambda_n$ can be considered as the $n$-th
approximation of a graph-directed self-similar fractal $\Lambda$ on the
plane, see Figure 1(c). We discuss the connection between the graph
theoretical properties of $G_n$ and properties of the limiting fractal
$\Lambda$.
Furthermore, using $\Lambda$ we generate a random graph sequence $G_n^r$
in a way which was inspired by the $W$-random graphs introduced by Lovász
and Szegedy [TP]. See also Diaconis, Janson [5], which contains a
list of corresponding references. We show that the degree sequence has
power law decay with the same exponent as the deterministic graph
sequence $G_n$. Thus we can define a random graph sequence with a
prescribed power law decay in a given range. Bollobás, Janson and
Riordan [5] considered inhomogeneous random graphs generated by a
kernel. Our model is not covered by their construction, since $\Lambda$
is a fractal set of zero two-dimensional Lebesgue measure.



The paper is organized as follows: In Section 2 we define the
deterministic model and the associated fractal set $\Lambda$. In Section 3
we verify the scale-free property of $G_n$ (Theorem 3.1). We compare the
Hausdorff dimension of $\Lambda$ to the power law exponent of the degree
sequence of $G_n$. Our next result is that both the diameter of $G_n$ and
the average length of a shortest path between two vertices are of the
order of the logarithm of the size of $G_n$ (Corollary 3.6 and
Theorem 3.7). In Section 3.4 we prove the above-mentioned properties of
the clustering coefficient of $\hat{G}_n$ (Theorems 3.11 and 3.13). In
Section 4 we describe the randomized model, and in Section 5 we prove
that the model exhibits the same power law decay as the corresponding
deterministic version.



2. Deterministic model 

The model was motivated by the hierarchical graph sequence model in 
[3], and is given as follows. 

2.1. Description of the model. Let $G$, our base graph, be any
labeled bipartite graph on the vertex set $\Sigma_1 = \{0, \dots, N-1\}$.
We partition $\Sigma_1$ into the non-empty sets $V_1, V_2$ such that one of
the end points of any edge is in $V_1$, and the other is in $V_2$. We write
$n_i := |V_i|$, $i = 1, 2$, for the cardinality of $V_i$. The edge set of
$G$ is denoted by $E(G)$. If the pair $x, y \in \Sigma_1$ is connected by an
edge, then this edge is denoted by $\binom{x}{y}$, since this notation
makes it convenient to follow the labels of the vertices along a path.

Now we define our graph sequence $\{G_n\}_{n \in \mathbb{N}}$ generated by
the base graph $G$.

The vertex set is $\Sigma_n = \{(x_1 x_2 \dots x_n) : x_i \in \Sigma_1\}$,
all words of length $n$ above the alphabet $\Sigma_1$. To be able to define
the edge set, we need some further definitions.

Definition 2.1.

(1) We assign a type to each element of $\Sigma_1$. Namely,

    $\mathrm{typ}(x) := 1$ if $x \in V_1$, and $\mathrm{typ}(x) := 2$ if $x \in V_2$.

(2) We define the type of a word $z = (z_1 z_2 \dots z_n) \in \Sigma_n$ as
follows: if all the elements $z_j$, $j = 1, \dots, n$, of $z$ fall in the
same $V_i$, $i = 1, 2$, then $\mathrm{typ}(z)$, the type of $z$, is $i$.
Otherwise $\mathrm{typ}(z) := 0$.

(3) For $x = (x_1 \dots x_n)$, $y = (y_1 \dots y_n) \in \Sigma_n$ we denote
the common prefix by

    $x \wedge y = (z_1 \dots z_k)$ s.t. $x_i = y_i = z_i$, $\forall i = 1, \dots, k$, and $x_{k+1} \neq y_{k+1}$.

(4) Given $x = (x_1 \dots x_n)$, $y = (y_1 \dots y_n) \in \Sigma_n$, the
postfixes $\tilde{x}, \tilde{y} \in \Sigma_{n - |x \wedge y|}$ are
determined by

    $x = (x \wedge y)\tilde{x}$, $y = (x \wedge y)\tilde{y}$,

where the concatenation of the words $a, b$ is denoted by $ab$.

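Now we can define the edge set $E(G_n)$. Two vertices $x$ and $y$ in $G_n$
are connected by an edge if and only if the following assumptions hold:

(a) One of the postfixes $\tilde{x}, \tilde{y}$ is of type 1, the other is
of type 2,

(b) for each $i > |x \wedge y|$, the coordinate pair $\binom{x_i}{y_i}$
forms an edge in $G$.

That is, $E(G_n) \subset \Sigma_n \times \Sigma_n$:

(1)  $E(G_n) = \left\{ \binom{x}{y} \;\middle|\; x = y \text{ or } \left( \{\mathrm{typ}(\tilde{x}), \mathrm{typ}(\tilde{y})\} = \{1, 2\} \text{ and } \forall\, |x \wedge y| < i \le n,\ \binom{x_i}{y_i} \in E(G) \right) \right\}$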

Remark 2.2. Note that we artificially added all loops to the (otherwise
bipartite) graph sequence $G_n$, which makes the calculations below easier
without losing the important properties. In particular, $G_1$ differs from
$G$ only in the loops.

Remark 2.3 (Hierarchical structure of $G_n$). For every initial digit
$x \in \{0, 1, \dots, N-1\}$, consider the set $W_x$ of vertices
$(x_1 \dots x_n)$ of $G_n$ with $x_1 = x$. Then the induced subgraph on
$W_x$ is identical to $G_{n-1}$.

We write $\deg_n(x)$ for the degree of a vertex $x$ in $G_n$, including the
loop which increases the degree by 2. However, for an $x \in \Sigma_1$,
$\deg(x)$ denotes the degree of $x$ in $G$. In particular
$\deg_1(x) = \deg(x) + 2$. In what follows, we will frequently use
$\ell(x)$, the length of the longest block from backwards in $x$ which has
a nonzero type,

(2)  $\ell(x) := \max\{ i : \mathrm{typ}(x_{n-i+1} \dots x_n) \in \{1, 2\} \}$.

Remark 2.4. The degree of a node $x \in \Sigma_n$ is

    $\deg_n(x) = 2 + S(x) \cdot \deg(x_n)$,

where

(3)  $S(x) := 1 + \deg(x_{n-1}) + \dots + \deg(x_{n-1}) \deg(x_{n-2}) \cdots \deg(x_{n-\ell(x)+1}) = \sum_{r=0}^{\ell(x)-1} \prod_{i=1}^{r} \deg(x_{n-i})$,

where the empty product is meant to be 1.

The following two examples satisfy the requirements of our general 
model. 

Example 2.5 (Cherry). Barabási, Ravasz and Vicsek [3] introduced
the "cherry" model presented on Figures 1(a) and 1(b): Let $V_1 = \{1\}$
and $V_2 = \{0, 2\}$, $E(G) = \{(1,0), (1,2)\}$.

Example 2.6 (Fan). Our second example is called "fan", and is defined
on Figure 3. Note that here $|V_1| > 1$.
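The edge rule (1) and the degree formula of Remark 2.4 can be checked by brute force on a small case. The sketch below does this for the cherry base graph of Example 2.5. The helper names (`word_type`, `adjacent`, `ell`, `S`, `degree_brute`) and the representation of words as 0-indexed tuples are illustrative assumptions, not notation from the paper.

```python
from itertools import product

# Cherry base graph (Example 2.5): V_1 = {1}, V_2 = {0, 2}, edges 1-0 and 1-2.
N = 3
TYPE = {0: 2, 1: 1, 2: 2}                      # typ(x) for each digit
BASE_EDGES = {(1, 0), (0, 1), (1, 2), (2, 1)}  # E(G), both orientations
DEG = {0: 1, 1: 2, 2: 1}                       # degrees in G


def word_type(w):
    """Type of a word: i if all digits lie in V_i, otherwise 0."""
    types = {TYPE[d] for d in w}
    return types.pop() if len(types) == 1 else 0


def adjacent(x, y):
    """Edge rule (1) for two distinct words (loops are counted separately)."""
    if x == y:
        return False
    k = next(i for i in range(len(x)) if x[i] != y[i])  # length of common prefix
    if {word_type(x[k:]), word_type(y[k:])} != {1, 2}:
        return False
    return all((a, b) in BASE_EDGES for a, b in zip(x[k:], y[k:]))


def ell(x):
    """Length of the longest block at the end of x with a nonzero type."""
    i = 1
    while i < len(x) and TYPE[x[-i - 1]] == TYPE[x[-1]]:
        i += 1
    return i


def S(x):
    """S(x) = sum over r < ell(x) of prod_{i=1..r} deg(x_{n-i}), as in (3)."""
    total, prod = 0, 1
    for r in range(ell(x)):
        if r >= 1:
            prod *= DEG[x[-1 - r]]
        total += prod
    return total


def degree_brute(x):
    """deg_n(x): number of neighbours found by rule (1), plus 2 for the loop."""
    return 2 + sum(adjacent(x, y) for y in product(range(N), repeat=len(x)))


# Compare the brute-force degree with Remark 2.4 on every vertex of G_3.
for x in product(range(N), repeat=3):
    assert degree_brute(x) == 2 + S(x) * DEG[x[-1]]
```

The brute-force search costs $N^{2n}$ adjacency tests, so it is only meant for small $n$; for larger $n$ the closed formula is the practical route.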



2.2. The embedding of the adjacency matrices into $[0,1]^2$. In
this section, we investigate the sequence of adjacency matrices
corresponding to $\{G_n\}_{n \in \mathbb{N}}$. Roughly speaking, we will map
them into the unit square, see Figure 1(c).

To represent the adjacency matrix of $G_n$ as a subset of the unit square,
first partition $[0,1]^2$ into $N^{2n}$ congruent boxes, i.e. divide $[0,1]$
into equal subintervals of length $N^{-n}$ corresponding to the first $n$
digits of the $N$-adic expansion of elements of $[0,1]$:

(4)  $I_{x_1 \dots x_n} := \left[ \sum_{r=1}^{n} \frac{x_r}{N^r},\ \sum_{r=1}^{n} \frac{x_r}{N^r} + \frac{1}{N^n} \right], \quad \forall (x_1 \dots x_n) \in \Sigma_n$.

We partition $[0,1]^2$ with the corresponding level-$n$ squares:

(5)  $Q_{\binom{x}{y}} := I_x \times I_y$.

A natural embedding of the adjacency matrix of $G_n$ in the unit square
is as follows:

(6)  $A_n(a,b) := 1$ if $(a,b) \in Q_{\binom{x}{y}}$ with $\binom{x}{y} \in E(G_n)$, and $A_n(a,b) := 0$ otherwise.

That is,

    $A_n(a,b) = \sum_{\binom{x}{y} \in E(G_n)} 1_{Q_{\binom{x}{y}}}(a,b)$.



We write $\Lambda_n$ for the support of the function $A_n(a,b)$, see
Figure 1(c). Observe that $\Lambda_n$ is a compact set and
$\Lambda_{n+1} \subset \Lambda_n$ holds for all $n$. So we can define the
non-empty compact set

(7)  $\Lambda := \bigcap_{n=1}^{\infty} \Lambda_n$.
Figure 1. $G_1$, $G_2$, $G_3$ and $\Lambda_1$, $\Lambda_2$, $\Lambda_3$ for the cherry, Example 2.5. Panel (b): $G_3$. Panel (c): the sets $\Lambda_1$, $\Lambda_2$, $\Lambda_3$.



Clearly,

    $1_{\Lambda}(a,b) = \lim_{n \to \infty} A_n(a,b)$.



Remark 2.7. This representation obviously depends on the labeling
of the graph $G$. For an arbitrary permutation $\pi$ of
$\{0, \dots, N-1\}$, the corresponding representation of $G_n$ is denoted
by $A_n^{\pi}(a,b)$. The relation between these two representations is
given by the formula

    $A_n^{\pi}(a,b) = A_n(\varphi_{\pi^{-1}}(a), \varphi_{\pi^{-1}}(b))$, and
    $1_{\Lambda^{\pi}}(a,b) = 1_{\Lambda}(\varphi_{\pi^{-1}}(a), \varphi_{\pi^{-1}}(b))$,

where the measurable function $\varphi_{\pi} : [0,1] \to [0,1]$ is defined by

    $\varphi_{\pi}(x) := \sum_{i=1}^{\infty} \frac{\pi(x_i)}{N^i}$ for $x = \sum_{i=1}^{\infty} \frac{x_i}{N^i}$.

2.3. Graph-directed structure of $\Lambda$. Now we prove that the limit
$\Lambda$ (defined in (7)) can be considered as the attractor of a not
irreducible graph-directed self-similar iterated function system (for the
definition see [8]), with the directed graph $\mathcal{G}$ defined below.

Definition 2.8. The vertex set $V(\mathcal{G})$ is partitioned into three
subsets:

    $V_{dd} := \left\{ \binom{x}{x} : x \in \Sigma_1 \right\}$,

(8) $V_{12} := \left\{ \binom{x}{y} \in E(G) : x \in V_1,\ y \in V_2 \right\}$,

    $V_{21} := \left\{ \binom{x}{y} \in E(G) : x \in V_2,\ y \in V_1 \right\}$.

Then

    $V(\mathcal{G}) = V_{dd} \cup V_{12} \cup V_{21}$.

The set of directed edges $E(\mathcal{G})$ of $\mathcal{G}$ is as follows:
First we connect all vertices in both directions within each of the three
sets $V_{dd}$, $V_{12}$ and $V_{21}$ (loops included). Then there is an
outgoing edge from each vertex in $V_{dd}$ to all vertices in $V_{12}$ and
$V_{21}$.

For every directed edge $e = (v_1, v_2) \in E(\mathcal{G})$ we define a
homothety:

(9)  $f_e : Q_{v_2} \to Q_{v_1}$, $\quad f_e(a,b) := \frac{1}{N}(a,b) + \frac{1}{N}(x_1, y_1)$, with $v_i = \binom{x_i}{y_i}$,

where $Q_v := I_x \times I_y$ is the level-1 square for
$v = \binom{x}{y} \in V(\mathcal{G})$.

The graph $\mathcal{G}$ corresponding to the graph sequence in the
"cherry" example is given by Figure 2.

In general, $\mathcal{G}$ is given by the schematic picture on the right
hand side of Figure 2, where the double arrow in between the complete
directed graphs on $V_{dd}$, $V_{12}$ and $V_{21}$ illustrates that we
connect all pairs of vertices in the given direction.

Let $\mathcal{P}_n$ be the set of all paths of length $n$ in $\mathcal{G}$,
i.e.

    $\mathcal{P}_n := \{ v = (v_1 \dots v_n) \mid \forall\, 1 \le i < n,\ (v_i, v_{i+1}) \in E(\mathcal{G}) \}$.
Figure 2. The graph $\mathcal{G}$ for the "cherry", Example 2.5.

For a $v = (v_1 \dots v_n) = \binom{x_1 \dots x_n}{y_1 \dots y_n} \in \mathcal{P}_n$
it immediately follows from definitions (5) and (9) that

(10)  $Q_v = f_v([0,1]^2) = I_{x_1 \dots x_n} \times I_{y_1 \dots y_n}$,

where

(11)  $f_v(\cdot) := f_{(v_1, v_2)} \circ \dots \circ f_{(v_{n-1}, v_n)} \circ f_{v_n}(\cdot)$ if $n \ge 2$,
      $f_v(a,b) := \frac{1}{N}(a,b) + \frac{1}{N}(x,y)$ if $n = 1$, $v = \binom{x}{y}$.

The key observation connecting $\mathcal{G}$ to the graph sequence $G_n$
is the following:

Claim 2.9. For all $n$ we have

    $E(G_n) = \mathcal{P}_n$.

Proof. Let $v = (v_1 \dots v_n) = \binom{a_1 \dots a_n}{b_1 \dots b_n} \in \Sigma_n \times \Sigma_n$,
thus $a = (a_1 \dots a_n)$ and $b = (b_1 \dots b_n)$ are vertices in $G_n$.
First we assume that $v \in E(G_n)$. Observe that by (1), the pairs
$\binom{a_i}{b_i}$ are vertices in $\mathcal{G}$. We would like to prove
that the sequence

(12)  $\binom{a_1}{b_1} \dots \binom{a_n}{b_n}$ is a path in $\mathcal{G}$.

If $k := |a \wedge b| \ge 1$, then for $i \le k$, $a_i = b_i$ holds, thus
the sequence of points $\binom{a_1}{b_1} \dots \binom{a_k}{b_k}$ forms a
path in $K_N(V_{dd})$. By (1), the pairs
$\binom{a_{k+1}}{b_{k+1}}, \dots, \binom{a_n}{b_n}$ are all edges in $G$,
thus vertices in $\mathcal{G}$. Furthermore, either they all belong to
$V_{12}$ or they are all contained in $V_{21}$, see (1). This implies that
this postfix also forms a path in $K_{|E|}(V_{12})$ or in
$K_{|E|}(V_{21})$. By definition of $E(\mathcal{G})$,
$\left( \binom{a_k}{b_k}, \binom{a_{k+1}}{b_{k+1}} \right)$ is an edge in
$\mathcal{G}$, so $\binom{a_1}{b_1} \dots \binom{a_n}{b_n}$ is a path in
$\mathcal{G}$. If $k = 0$ then the whole path is contained either in
$V_{12}$ or in $V_{21}$. This completes the proof of (12).

On the other hand, if $\binom{a_1}{b_1} \dots \binom{a_n}{b_n}$ is a path
of length $n$ in $\mathcal{G}$ then we claim that for
$a = (a_1 \dots a_n)$, $b = (b_1 \dots b_n) \in V(G_n)$

    $\binom{a}{b} \in E(G_n)$.

The proof is very similar to the previous one. □
In this way we can characterize $\Lambda_n$ as follows:

Corollary 2.10.

    $\Lambda_n = \bigcup_{v \in \mathcal{P}_n} Q_v = \bigcup_{v \in \mathcal{P}_n} f_v([0,1]^2)$.

Proof. Immediately follows from (6) and (10) and the assertion of
Claim 2.9. □
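Claim 2.9 can be sanity-checked numerically on the cherry example. The following sketch enumerates the paths of length $n$ in the directed graph of Definition 2.8 and compares them with the edge set given by rule (1), loops included. The function names and the encoding of a vertex $\binom{x}{y}$ as the tuple `(x, y)` are assumptions made for this illustration.

```python
from itertools import product

# Cherry base graph (Example 2.5).
N = 3
TYPE = {0: 2, 1: 1, 2: 2}
BASE_EDGES = {(1, 0), (0, 1), (1, 2), (2, 1)}  # E(G), both orientations


def word_type(w):
    """Type of a word: i if all digits lie in V_i, otherwise 0."""
    types = {TYPE[d] for d in w}
    return types.pop() if len(types) == 1 else 0


def adjacent(x, y):
    """Edge rule (1) for two distinct words."""
    if x == y:
        return False
    k = next(i for i in range(len(x)) if x[i] != y[i])
    if {word_type(x[k:]), word_type(y[k:])} != {1, 2}:
        return False
    return all((a, b) in BASE_EDGES for a, b in zip(x[k:], y[k:]))


# Vertices of the directed graph: V_dd (diagonal pairs) plus V_12 and V_21.
VERTICES = [(a, a) for a in range(N)] + sorted(BASE_EDGES)


def component(v):
    """Which of the three vertex classes V_dd, V_12, V_21 contains v."""
    if v[0] == v[1]:
        return "dd"
    return "12" if TYPE[v[0]] == 1 else "21"


def is_arrow(u, v):
    """Directed edge u -> v: inside one class, or leaving V_dd."""
    return component(u) == component(v) or component(u) == "dd"


def paths(n):
    """All paths of length n, returned as pairs of words (x, y)."""
    found = set()
    for p in product(VERTICES, repeat=n):
        if all(is_arrow(p[i], p[i + 1]) for i in range(n - 1)):
            found.add((tuple(v[0] for v in p), tuple(v[1] for v in p)))
    return found


def edges_with_loops(n):
    """E(G_n) as ordered pairs, including the loops added in (1)."""
    words = list(product(range(N), repeat=n))
    return {(x, y) for x in words for y in words if x == y or adjacent(x, y)}


for n in (1, 2, 3):
    assert paths(n) == edges_with_loops(n)
```

Both sets are stored as ordered pairs of words, which matches the convention $E(G_n) \subset \Sigma_n \times \Sigma_n$ used in the text.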

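Let us define

    $\mathcal{P}_{\infty} := \{ v = (v_1 v_2 \dots) \mid \forall i \in \mathbb{N},\ (v_i, v_{i+1}) \in E(\mathcal{G}) \}$.

Now for every $v \in \mathcal{P}_{\infty}$ the intersection
$\bigcap_{n=1}^{\infty} Q_{(v_1 \dots v_n)}$ is a point in $[0,1]^2$, which
will be denoted by $\Pi(v)$. That is,

    $\Pi : \mathcal{P}_{\infty} \to [0,1]^2$, $\quad \Pi(v) := \bigcap_{n=1}^{\infty} Q_{(v_1 \dots v_n)} = \lim_{n \to \infty} f_{v_1 \dots v_n}(0,0)$.

It is an immediate consequence of Corollary 2.10 that

(13)  $\Pi(\mathcal{P}_{\infty}) = \Lambda$, i.e. $\Lambda = \bigcup_{v \in \mathcal{P}_{\infty}} \Pi(v)$.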

This means that $\Lambda_n$, the embedded adjacency matrix of $G_n$, can be
considered as the $n$-th approximation of the fractal set $\Lambda$.
In this way we coded the elements of $\Lambda$ by the elements of
$\mathcal{P}_{\infty}$. This coding is not one-to-one for the same reason
as the $N$-adic expansion is not one-to-one. However, if neither of the
two coordinates of a point $(a,b) \in \Lambda$ is an $N$-adic rational
number, then $(a,b)$ has a unique code.
2.4. Fractal geometric characterisation of $\Lambda$. For notational
convenience we define the set of finite words above the alphabet $V_{dd}$
(including the empty word as well):

    $V_{dd}^{*} := \{ v \mid \exists n \in \mathbb{N} \cup \{0\},\ v = (v_1 \dots v_n) \text{ and } v_i \in V_{dd} \}$.

The three subgraphs $K_{|E|}(V_{12})$, $K_{|E|}(V_{21})$ and $K_N(V_{dd})$
of $\mathcal{G}$ are complete directed graphs. We consider the three
corresponding self-similar iterated function systems (IFS):

    $\mathcal{F}_{dd} := \{f_v\}_{v \in V_{dd}}$, $\quad \mathcal{F}_{12} := \{f_v\}_{v \in V_{12}}$, $\quad \mathcal{F}_{21} := \{f_v\}_{v \in V_{21}}$,

where the functions $f_v$, $v \in V(\mathcal{G})$, were defined in (11).
The attractors of these IFS-s (see [8, p. 30]) are the unique nonempty
compact sets satisfying

      $\Lambda_{dd} := \bigcup_{v \in V_{dd}} f_v(\Lambda_{dd}) = \{ \Pi(v) \mid v = (v_1 v_2 \dots) \text{ and } v_i \in V_{dd} \}$,

(14)  $\Lambda_{12} := \bigcup_{v \in V_{12}} f_v(\Lambda_{12}) = \{ \Pi(v) \mid v = (v_1 v_2 \dots) \text{ and } v_i \in V_{12} \}$,

      $\Lambda_{21} := \bigcup_{v \in V_{21}} f_v(\Lambda_{21}) = \{ \Pi(v) \mid v = (v_1 v_2 \dots) \text{ and } v_i \in V_{21} \}$.

The Open Set Condition (see e.g. [8, p. 35]) holds for these IFS-s, so we
can easily compute the Hausdorff dimension of the attractors. Clearly,
$\Lambda_{dd}$ is the diagonal of the unit square.

Now we prove that $\Lambda$ is a countable union of homothetic copies of
these attractors.

Theorem 2.11.

    $\Lambda = \mathrm{Diag} \cup \bigcup_{v \in V_{dd}^{*}} \left( f_v(\Lambda_{12}) \cup f_v(\Lambda_{21}) \right)$,

where $\mathrm{Diag} = \{(x,x) : x \in [0,1]\}$.

Remark 2.12. Observe that $\Lambda_{21}$ is the image of $\Lambda_{12}$
under the reflection through the diagonal, hence $\Lambda$ is symmetric
with respect to the diagonal. The same is true for the $n$-th
approximation $\Lambda_n$ of $\Lambda$. This can be seen immediately by
using the embedded adjacency matrix characterization of $\Lambda_n$.



Proof of Theorem 2.11. We start by showing that

(15)  $\Lambda \subset \mathrm{Diag} \cup \bigcup_{v \in V_{dd}^{*}} \left( f_v(\Lambda_{12}) \cup f_v(\Lambda_{21}) \right)$.

Pick an arbitrary point $(a,b) \in \Lambda$. As a consequence of (13) there
exists a $v = (v_1 v_2 \dots) \in \mathcal{P}_{\infty}$ such that
$\Pi(v) = (a,b)$. Let $k := \max\{\ell : v_{\ell} \in V_{dd}\}$. We
distinguish three cases: $k = 0$, $k = \infty$ or $0 < k < \infty$. Mind
that for all $i \le k$, $v_i \in V_{dd}$, since once the path has left the
component $V_{dd}$, there is no way to return. Since $V_{12}$ and $V_{21}$
are closed, for $k < \infty$ all $v_i$, $i > k$, are in the same component
$V_{12}$ or $V_{21}$.

Case $k = 0$: Clearly either all $v_i$ are in $V_{12}$ or in $V_{21}$, so
$\Pi(v) \in \Lambda_{12} \cup \Lambda_{21}$.

Case $k = \infty$: For the same reason,
$\Pi(v) = \lim_{n \to \infty} f_{v_1 \dots v_n}(0,0) \in \Lambda_{dd} = \mathrm{Diag}$.
This is so because $f_{v_1 \dots v_n}(0,0)$ is in the neighborhood of the
diagonal $\{(x,x) : x \in [0,1]\}$.

Case $0 < k < \infty$: Let $w_k = (v_1 \dots v_k)$. By symmetry, without
loss of generality we may assume that $v_{k+1} \in V_{12}$. As in the first
case, we can see that for $w := (v_{k+1} v_{k+2} \dots)$,
$\Pi(w) \in \Lambda_{12}$. Hence
$\Pi(v) = f_{w_k}(\Pi(w)) \in f_{w_k}(\Lambda_{12})$.



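Now we have verified (15). To prove the opposite direction, that is

(16)  $\Lambda \supset \mathrm{Diag} \cup \bigcup_{v \in V_{dd}^{*}} \left( f_v(\Lambda_{12}) \cup f_v(\Lambda_{21}) \right)$,

we will use the symbolic representation of $\Lambda$ given in (13).

Pick an $x \in [0,1]$ and take the $N$-adic code $(x_1 x_2 \dots)$ of $x$.
That is,

    $x = \sum_{n=1}^{\infty} \frac{x_n}{N^n}$, $\quad x_n \in \{0, \dots, N-1\}$.

Then for $v := \left( \binom{x_1}{x_1} \binom{x_2}{x_2} \dots \right) \in \mathcal{P}_{\infty}$
it is easy to see that $\Pi(v) = (x,x)$. So by (13), $(x,x) \in \Lambda$.

Now we assume that
$(a,b) \in \bigcup_{v \in V_{dd}^{*}} \left( f_v(\Lambda_{12}) \cup f_v(\Lambda_{21}) \right)$.
Without loss of generality we may further assume that
$(a,b) \in f_v(\Lambda_{12})$ for some $v \in V_{dd}^{*}$. That is,
$(a,b) = f_v(a',b')$ where $(a',b') \in \Lambda_{12}$. By (14) there exists
a $w := (w_1 w_2 \dots)$, $w_i \in V_{12}$, such that $\Pi(w) = (a',b')$. In
this way, for the concatenation $t := vw \in \mathcal{P}_{\infty}$ we have
$(a,b) = \Pi(t)$, which implies $(a,b) \in \Lambda$. This completes the
proof of (16). □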
2.5. The same model without loops. Let $G'_n$ be the same graph as
$G_n$ but without loops, i.e. $V(G'_n) = V(G_n)$ and
$E(G'_n) \subset \Sigma_n \times \Sigma_n$:

    $E(G'_n) = \left\{ \binom{x}{y} \;\middle|\; \{\mathrm{typ}(\tilde{x}), \mathrm{typ}(\tilde{y})\} = \{1, 2\} \text{ and } \forall\, |x \wedge y| < i \le n,\ \binom{x_i}{y_i} \in E(G) \right\}$.

In this case $\Lambda'_n = \Lambda_n \setminus \mathrm{Diag}_n$, where
$\mathrm{Diag}_n$ is the union of the level-$n$ squares that have nonempty
intersection with the diagonal. The sequence $\{\Lambda'_n\}$ is not a
nested sequence of compact sets. However, it is easy to see that the
characteristic function of $\Lambda'_n$ tends to the characteristic
function of $\Lambda \setminus \mathrm{Diag}$. Further, $\Lambda'_n$ tends
to $\Lambda$ in the Hausdorff metric, see [8].

3. Properties of the sequence $\{G_n\}$ and $\Lambda$

In this section we compute the degree distribution of $G_n$, and relate it
to the Hausdorff dimension of $\Lambda$. We also compute the length of the
average shortest path in $G_n$. To get interesting results about the local
clustering coefficient we need to modify our graph sequence $G_n$ along
the lines of [2].

3.1. Degree distribution of $\{G_n\}$. Here we compute the degree
distribution under the following regularity assumption on the base graph
$G$:

(A1)  $\deg(x) = d_1$ for all $x \in V_1$, and $d_2 := \max_{y \in V_2} \deg(y) \le d_1 - 1$.

Recall that we defined $\ell(x)$ in (2) as the length of the longest block
from backwards of the node $x$ such that the last $\ell(x)$ digits of $x$
belong to the same $V_i$. Put
$\Sigma_n^i := \{x \in \Sigma_n \mid x_n \in V_i\}$, $i = 1, 2$. It follows
from (A1) and Remark 2.4 that the degree of a node $x \in \Sigma_n^1$ is
$\frac{d_1^{\ell(x)+1} - 1}{d_1 - 1} + 1$, and for $\ell < n$ the number of
such nodes with $\ell(x) = \ell$ is exactly
$N^{n-\ell-1} \cdot n_2 \cdot n_1^{\ell}$.

Under assumption (A1), the decay of the degree distribution is determined
by the set of high degree nodes denoted by

    $HD_n := \left\{ x \in \Sigma_n^1 \;\middle|\; \deg_n(x) > \max_{y \in \Sigma_n^2} \deg_n(y) \right\}$.

An equivalent characterisation of $HD_n$ is

    $HD_n = \left\{ x \in \Sigma_n^1 \;\middle|\; \ell(x) > \frac{1}{\log d_1} \max\{(n+1) \log(d_2), \log n\} \right\}$.
Figure 3. Example "fan". (a) $G$ on the left and $G_1$ on the right hand side. Here $V_1 = \{2, 4\}$ and $V_2 = \{0, 1, 3, 5\}$. (b) The graph $G_2$ (contains additionally all loops).

This is so because the degree of any $y \in \Sigma_n^2$ is at most
$\max\{d_2^{n+1}, n\}$. The tail of the cumulative degree distribution is

    $P\left( \deg_n(X) > \frac{d_1^{\ell+1} - 1}{d_1 - 1} + 1 \right) = \frac{n_1^{\ell+1} N^{n-\ell-1}}{N^n} = \left( \frac{n_1}{N} \right)^{\ell+1}$,

where $X$ is a uniformly chosen node of $G_n$. Mind that as long as
$\ell < n$, this probability does not depend on $n$. Writing
$F(t) = P(\deg(X) > t)$ for the tail of the cumulative distribution
function we get the power law decay

    $F(t) = t^{-\frac{\log(N/n_1)}{\log d_1}} \cdot c(d_1)$ for $t = \frac{d_1^{\ell+1} - 1}{d_1 - 1} + 1$.
So we have proved

Theorem 3.1. The degree distribution of the graph sequence $G_n$
satisfying assumption (A1) has a power law decay with exponent

(17)  $\gamma = 1 + \frac{\log(N/n_1)}{\log d_1}$.

This implies that the largest decay $\gamma$ we can get in this family of
models is $1 + \frac{\log 3}{\log 2}$, and the maximum is attained at
$n_1 = 1$ and $d_1 = 2 = n_2$. This is exactly the graph sequence in
Example 2.5, see Figures 1(a) and 1(b). We will later see that the case
$n_1 = 1$ is important in another sense as well, see Section 3.2.
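The tail probability $(n_1/N)^{\ell+1}$ derived above can be checked exactly on the cherry example ($N = 3$, $n_1 = 1$, $d_1 = 2$). The sketch below computes all degrees of $G_5$ by brute force from rule (1) and compares the empirical tail with the formula for values of $\ell$ large enough that only high degree nodes are counted. Function names and the choice $n = 5$ are assumptions for this illustration.

```python
from fractions import Fraction
from itertools import product

# Cherry base graph (Example 2.5): N = 3, n_1 = |V_1| = 1, d_1 = 2.
N, N1, D1 = 3, 1, 2
TYPE = {0: 2, 1: 1, 2: 2}
BASE_EDGES = {(1, 0), (0, 1), (1, 2), (2, 1)}


def word_type(w):
    """Type of a word: i if all digits lie in V_i, otherwise 0."""
    types = {TYPE[d] for d in w}
    return types.pop() if len(types) == 1 else 0


def adjacent(x, y):
    """Edge rule (1) for two distinct words."""
    if x == y:
        return False
    k = next(i for i in range(len(x)) if x[i] != y[i])
    if {word_type(x[k:]), word_type(y[k:])} != {1, 2}:
        return False
    return all((a, b) in BASE_EDGES for a, b in zip(x[k:], y[k:]))


def degrees(n):
    """deg_n(x) for every vertex of G_n (the loop contributes 2)."""
    words = list(product(range(N), repeat=n))
    return [2 + sum(adjacent(x, y) for y in words) for x in words]


def tail(degs, level):
    """Fraction of vertices whose degree exceeds (d_1^(level+1) - 1)/(d_1 - 1) + 1."""
    threshold = (D1 ** (level + 1) - 1) // (D1 - 1) + 1
    return Fraction(sum(d > threshold for d in degs), len(degs))


degs = degrees(5)
for level in (2, 3):
    assert tail(degs, level) == Fraction(N1, N) ** (level + 1)
```

For smaller values of the level the identity can fail on this example, because some vertices ending in $V_2$ then exceed the threshold; this is exactly the restriction to $HD_n$ made in the text.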



3.2. Hausdorff dimension of $\Lambda$. In Theorem 2.11 we decomposed
$\Lambda$ into the diagonal of the square and countably many homothetic
copies of $\Lambda_{12}$ and $\Lambda_{21}$, both attractors of
self-similar IFS-s. Hence the Hausdorff dimension of $\Lambda$ is the
maximum of the dimension of the diagonal and
$\dim_H \Lambda_{12} = \dim_H \Lambda_{21}$. Note that the self-similar IFS
$\mathcal{F}_{12}$ consists of $|E|$ similarities of contraction ratio
$\frac{1}{N}$, and satisfies the Open Set Condition. As an immediate
application of [8, Theorem 2.7], the Hausdorff dimension of $\Lambda_{12}$
and $\Lambda_{21}$ is $\dim_H \Lambda_{12} = \frac{\log |E|}{\log N}$.
By the argument above we have proved the following theorem:

Theorem 3.2. The Hausdorff dimension of $\Lambda$ is

    $\dim_H \Lambda = \max\left\{ \frac{\log |E|}{\log N},\ 1 \right\}$,

furthermore,

    $\dim_H (\Lambda \setminus \mathrm{Diag}) = \frac{\log |E|}{\log N}$.

Corollary 3.3. If $|V_1| = n_1 = 1$, then (A1) holds with $d_1 = |E|$ in
the bipartite $G$. Hence the degree distribution exponent (17) equals

    $\gamma = 1 + \frac{\log N}{\log |E|} = 1 + \frac{1}{\dim_H (\Lambda \setminus \mathrm{Diag})}$.
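As a worked instance of Theorem 3.2 and Corollary 3.3, the values for the cherry base graph of Example 2.5 are spelled out below. The computation assumes that the exponent in (17) is read as $\gamma = 1 + \log(N/n_1)/\log d_1$; the numerical values are rounded to three decimals.

```latex
% Cherry base graph (Example 2.5): N = 3, |E| = 2, n_1 = 1, d_1 = 2.
\[
  \dim_H(\Lambda \setminus \mathrm{Diag}) = \frac{\log 2}{\log 3} \approx 0.631,
  \qquad
  \dim_H \Lambda = \max\Bigl\{ \frac{\log 2}{\log 3},\, 1 \Bigr\} = 1,
\]
\[
  \gamma = 1 + \frac{\log 3}{\log 2}
         = 1 + \frac{1}{\dim_H(\Lambda \setminus \mathrm{Diag})}
         \approx 2.585 .
\]
```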



3.3. Average shortest path in $G_n$. In many real networks, the typical
distance between two randomly chosen points is of order $\log(|G|)$,
the logarithm of the size of the network. We will see that our model also
shares this property as well as the power law decay and the hierarchical
structure, combining all these important features.
In this section we calculate the average length of a shortest path between
two nodes in $G_n$. First we give a deterministic way to construct one of
the shortest paths between any two nodes in the graph. To do so, we
need to introduce some notation. Recall that the graph $G$ is a bipartite
graph with partition $V_1, V_2$, see the beginning of Section 2. We remind
the reader that for $x, y \in \Sigma_n$, $\mathrm{typ}(x)$, the common
prefix $x \wedge y$ and the postfixes $\tilde{x}, \tilde{y}$ were defined
in Definition 2.1.



Definition 3.4.

For two arbitrary vertices $x, y \in \Sigma_n$ we denote the length of
their common prefix by $k = k(x,y) := |x \wedge y|$. Furthermore, let us
decompose the postfixes $\tilde{x}, \tilde{y}$ into blocks of digits of the
same type:

(18)  $\tilde{x} = b_1 b_2 \dots b_r$, $\quad \tilde{y} = c_1 c_2 \dots c_q$,

such that all of the blocks have a nonzero type and the consecutive blocks
are of different types. That is, for $i = 1, \dots, r-1$,
$j = 1, \dots, q-1$ we have

    $\mathrm{typ}(b_i) \neq \mathrm{typ}(b_{i+1}) \in \{1, 2\}$, and $\mathrm{typ}(c_j) \neq \mathrm{typ}(c_{j+1}) \in \{1, 2\}$.

Note that we denoted the number of blocks in $\tilde{x}, \tilde{y}$ by $r$
and $q$, respectively. If $X$ and $Y$ are two random vertices of $G_n$,
then the same notation as in (18) is used with capital letters.

Now we fix an arbitrary self-map $p$ of $\Sigma_1$ such that

    $\binom{x}{p(x)} \in E(G)$ $\quad \forall x \in G$.

Most commonly, $p(p(x)) \neq x$. Note that $x$ and $p(x)$ have different
types since $G$ is bipartite. For a word $z = (z_1 \dots z_m)$ with
$\mathrm{typ}(z) \in \{1, 2\}$ we define
$p(z) := (p(z_1) \dots p(z_m))$. Then,

(19)  $\binom{tz}{tp(z)}$ is an edge in $G_{\ell+m}$, $\forall t = (t_1 \dots t_{\ell})$,

follows from (1).

As usual we write $\mathrm{Diam}(G)$ for the maximal graph-distance in the
graph $G$ within components of $G$. Clearly $\mathrm{Diam}(G) \le N - 1$.

Lemma 3.5. Let $x, y$ be arbitrary vertices in the same connected
component of $G_n$. Using the notation above, the length of the shortest
path between them is at least $r + q - 1$ and at most
$r + q + \mathrm{Diam}(G) - 2$.

Considering the worst case scenario, i.e. choosing all blocks of length
1, yields:

Corollary 3.6. The diameter of the graph $G_n$ is at most
$2n + \mathrm{Diam}(G) - 2$. Since the size of the graph is $N^n$,
therefore

    $\mathrm{Diam}(G_n) = \frac{2}{\log N} \log(|G_n|) + O(1)$.
Proof of Lemma 3.5. First we construct a path $P(x,y)$ of minimal length.
Starting from $x$, the first part of the path $P(x,y)$ is as follows:

    $x^0 = x = (x \wedge y) b_1 \dots b_{r-1} b_r$
    $x^1 = (x \wedge y) b_1 \dots b_{r-1} p(b_r)$
    $\vdots$
    $x^{r-1} = (x \wedge y) b_1 p(b_2 \dots p(b_{r-1} p(b_r)))$.

Starting from $y$, the last part of the path $P(x,y)$ is as follows:

    $y^0 = y = (x \wedge y) c_1 c_2 \dots c_q$
    $y^1 = (x \wedge y) c_1 \dots c_{q-1} p(c_q)$
    $\vdots$
    $y^{q-1} = (x \wedge y) c_1 p(c_2 \dots p(c_{q-1} p(c_q)))$.

It follows from (19) that

    $P_x := (x^0, x^1, \dots, x^{r-1})$,
    $P_y := (y^{q-1}, \dots, y^1, y^0)$

are two paths in $G_n$. To construct $P(x,y)$ the only thing that remains
is to connect $x^{r-1}$ and $y^{q-1}$. Using (19) it is easy to see that
this can be done with a path $P_c$ of length at most $\mathrm{Diam}(G)$.
In this way,

    $P(x,y) := P_x P_c P_y$.

Clearly

    $r + q - 1 \le \mathrm{Length}(P(x,y)) \le r + q + \mathrm{Diam}(G) - 2$.

On the other hand, now we prove that no path shorter than $P(x,y)$ exists.
Recall that it follows from (1) that for any path
$Q(x,y) = (x = q^0, \dots, q^{\ell} = y)$, the consecutive elements of the
path only differ in their postfixes, which have different types. That is,

    $\forall i$, $q^i = w^i z^i$, $q^{i+1} = w^i \hat{z}^i$, with $\mathrm{typ}(z^i) \neq \mathrm{typ}(\hat{z}^i) \in \{1, 2\}$.

This implies that in each step on the path, the number of blocks in (18)
changes by at most one. Recall that $|x \wedge y| = k$, so
$x_{k+1} \neq y_{k+1}$. Since the digit on the $(k+1)$-th position changes
on the path, we have to reach a point where all the digits to the right
of the $k$-th position are of the same type. Starting from $q^0 = x$, to
reach the first vertex $a$ with this property, we need at least $r - 1$
steps on any path, where $r$ was defined in formula (18). Similarly,
starting from $y$, we need at least $q - 1$ steps to reach the first
vertex $b$ where all the digits after the $k$-th position are of the same
type. Because $x_{k+1} \neq y_{k+1}$, we need at least one more edge and
at most $\mathrm{Diam}(G)$ edges. □
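The two bounds of Lemma 3.5 can be tested against breadth-first search distances on a small instance. The sketch below does this for the cherry base graph, whose base graph is the path 0 - 1 - 2 with diameter 2. The function names are illustrative assumptions; loops are ignored since they do not affect distances.

```python
from collections import deque
from itertools import product

# Cherry base graph (Example 2.5); its diameter is 2 (path 0 - 1 - 2).
N = 3
TYPE = {0: 2, 1: 1, 2: 2}
BASE_EDGES = {(1, 0), (0, 1), (1, 2), (2, 1)}
DIAM_G = 2


def word_type(w):
    """Type of a word: i if all digits lie in V_i, otherwise 0."""
    types = {TYPE[d] for d in w}
    return types.pop() if len(types) == 1 else 0


def adjacent(x, y):
    """Edge rule (1) for two distinct words."""
    if x == y:
        return False
    k = next(i for i in range(len(x)) if x[i] != y[i])
    if {word_type(x[k:]), word_type(y[k:])} != {1, 2}:
        return False
    return all((a, b) in BASE_EDGES for a, b in zip(x[k:], y[k:]))


def num_blocks(w):
    """Number of maximal runs of digits of the same type, as in (18)."""
    return sum(1 for i in range(len(w)) if i == 0 or TYPE[w[i]] != TYPE[w[i - 1]])


def distances_from(src, words):
    """Breadth-first search distances from src to every reachable word."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in words:
            if v not in dist and adjacent(u, v):
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def lemma_holds(n):
    """Check r + q - 1 <= d(x, y) <= r + q + Diam(G) - 2 for all connected pairs."""
    words = list(product(range(N), repeat=n))
    for x in words:
        dist = distances_from(x, words)
        for y in words:
            if y == x or y not in dist:
                continue
            k = next(i for i in range(n) if x[i] != y[i])
            r, q = num_blocks(x[k:]), num_blocks(y[k:])
            if not (r + q - 1 <= dist[y] <= r + q + DIAM_G - 2):
                return False
    return True


assert lemma_holds(2) and lemma_holds(3)
```

The check is exhaustive only for the chosen base graph and small $n$; it illustrates the statement and does not replace the proof.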

Theorem 3.7. The expectation of the length of a shortest path between
two uniformly chosen vertices $X, Y \in G_n$ can be bounded by

    $\frac{4 n_1 n_2}{N^2}(n-1) \le E(|P(X,Y)|) \le N + \frac{4 n_1 n_2}{N^2}(n-1)$.

Corollary 3.8. The magnitude of the average length of a shortest path
between two uniformly chosen vertices in $G_n$ is the logarithm of the
size of $G_n$, which is the same order as $\mathrm{Diam}(G_n)$.



Proof of Theorem 3.7. Let $X, Y$ be independent, uniformly chosen
vertices of $G_n$. In this proof we use the notation introduced in
Definitions 2.1 and 3.4. The digits of the code of a uniformly chosen
vertex are independent and uniform in $\{0, \dots, N-1\}$, hence
$K(X,Y) := |X \wedge Y|$ has a truncated geometric distribution with
parameter $\frac{N-1}{N}$. That is,

    $P(K(X,Y) = k) = \left( \frac{1}{N} \right)^{k} \frac{N-1}{N}$ if $0 \le k < n$, and $P(K(X,Y) = n) = \left( \frac{1}{N} \right)^{n}$.

Furthermore, given that the length of the prefix is $k = K(X,Y)$, the
random variables $R$ and $Q$ (see Definition 3.4) can be represented as
the sum of indicators corresponding to the start of a new block:

    $R = 1 + \sum_{l=1}^{n-k-1} 1_{\mathrm{typ}(X_{k+l}) \neq \mathrm{typ}(X_{k+l+1})}$,

    $Q = 1 + \sum_{l=1}^{n-k-1} 1_{\mathrm{typ}(Y_{k+l}) \neq \mathrm{typ}(Y_{k+l+1})}$.

Taking expectation yields

    $E(Q \mid K(X,Y) = k) = E(R \mid K(X,Y) = k)$
    $\quad = 1 + E\left( \sum_{l=1}^{n-k-1} 1_{\mathrm{typ}(X_{k+l}) \neq \mathrm{typ}(X_{k+l+1})} \right)$
    $\quad = 1 + \sum_{l=1}^{n-k-1} P\left( \mathrm{typ}(X_{k+l}) \neq \mathrm{typ}(X_{k+l+1}) \right)$
    $\quad = 1 + (n-k-1) \frac{2 n_1 n_2}{N^2}$.

So weighting this with the geometric weights of the length of the prefix,
we get

    $E(Q) = E(R) = E(E(R \mid K(X,Y)))$
    $\quad = E\left( 1 + (n - K(X,Y) - 1) \frac{2 n_1 n_2}{N^2} \right)$
    $\quad = 1 + \left( n - 1 - \frac{1}{N-1} \left( 1 - \frac{1}{N^n} \right) \right) \frac{2 n_1 n_2}{N^2}$.

Using this and the following immediate consequence of Lemma 3.5

    $-1 \le E(|P(X,Y)| - (R + Q)) \le \mathrm{Diam}(G) - 2$,

finally we obtain that

    $1 - \frac{4 n_1 n_2}{N^2 (N-1)} + \frac{4 n_1 n_2}{N^2}(n-1) \le E(|P(X,Y)|) \le \mathrm{Diam}(G) + \frac{4 n_1 n_2}{N^2}(n-1)$.

□
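The conditional expectation used in the proof, $E(R \mid K = k) = 1 + (n-k-1)\,2 n_1 n_2 / N^2$ for $k < n$, can be confirmed by exhaustive enumeration. The sketch below does so for the cherry base graph ($N = 3$, $n_1 = 1$, $n_2 = 2$) with exact rational arithmetic. The function names are assumptions for this illustration, and only prefix lengths $k < n$ are checked.

```python
from fractions import Fraction
from itertools import product

# Cherry base graph (Example 2.5): N = 3, n_1 = 1, n_2 = 2.
N, N1, N2 = 3, 1, 2
TYPE = {0: 2, 1: 1, 2: 2}


def num_blocks(w):
    """Number of maximal runs of digits of the same type, as in (18)."""
    return sum(1 for i in range(len(w)) if i == 0 or TYPE[w[i]] != TYPE[w[i - 1]])


def prefix_len(x, y):
    """Length of the common prefix of two words of equal length."""
    k = 0
    while k < len(x) and x[k] == y[k]:
        k += 1
    return k


def conditional_mean_blocks(n, k):
    """Exact E(R | K = k), averaging over all pairs (x, y) with prefix length k."""
    words = list(product(range(N), repeat=n))
    total = count = 0
    for x in words:
        for y in words:
            if prefix_len(x, y) == k:
                total += num_blocks(x[k:])
                count += 1
    return Fraction(total, count)


def predicted(n, k):
    """The closed form 1 + (n - k - 1) * 2 n_1 n_2 / N^2 from the proof."""
    return 1 + (n - k - 1) * Fraction(2 * N1 * N2, N * N)


for k in range(3):
    assert conditional_mean_blocks(3, k) == predicted(3, k)
```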

3.4. Decay of local clustering coefficient of the modified sequence
$\{\hat{G}_n\}$. An important property of most real networks is the
high degree of clustering. In general, the local clustering coefficient of
a node $v$ having $n_v$ neighbors is defined as

    $C_v := \frac{\#\{\text{links between neighbors of } v\}}{\binom{n_v}{2}}$.

Note that the numerator in the formula is the number of triangles
containing $v$ and $C_v$ is the portion of the pairs of neighbors of $v$
which form a triangle with $v$ in the graph.
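The definition of $C_v$ translates directly into a few lines of code. The following sketch computes it for a graph stored as a dictionary of neighbor sets; the function name, the toy graph, and the convention of returning 0 for vertices with fewer than two neighbors (where the formula is undefined) are assumptions made for this illustration.

```python
from itertools import combinations


def local_clustering(adj, v):
    """C_v = (#links between neighbours of v) / binom(n_v, 2)."""
    nbrs = sorted(adj[v])
    if len(nbrs) < 2:
        return 0.0  # assumed convention: the formula is undefined for n_v < 2
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return links / (len(nbrs) * (len(nbrs) - 1) / 2)


# Toy graph: a triangle 0-1-2 with a pendant vertex 3 attached to 0.
ADJ = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}

assert local_clustering(ADJ, 1) == 1.0  # the single pair of neighbours is linked
assert local_clustering(ADJ, 3) == 0.0  # only one neighbour
```

In the toy graph vertex 0 has three neighbors and one link among them, so its coefficient is one third: the higher degree vertex has the lower coefficient, which is the qualitative behavior studied in this section.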

Observe that without the loops the graph sequence $G_n$ is bipartite, i.e.
there are no triangles in the graph $G_n$. However, we can modify the
graph sequence $G_n$ in a natural way, like in [2], to get a new sequence
$\hat{G}_n$ preserving the hierarchical structure of $G_n$, still
reflecting the dependence of the clustering coefficient on node degree
observed in several real networks. Namely, the local clustering
coefficient of a vertex $v$ is of order $1/\deg(v)$.

Definition 3.9.

• We obtain the graph $\hat{G}$ by adding a set of extra edges $RE(G)$ to
$G$ satisfying the following property:

Property R. $\forall x \in \Sigma_1$, $\exists y, z \in \Sigma_1$ such
that two among the edges of the triangle $(x,y,z)_{\Delta}$ are contained
in $E(G)$ and one of the edges is in $RE(G)$.

Figure 4. Clustering extended "fan". (a) $G$ and $\hat{G}$. (b) $\hat{G}_2$. The edges of $G_2$ and $\hat{G}_2$ differ only at the lowest hierarchical level (cf. Figure 3).

So,

    $V(\hat{G}) = V(G)$ and $E(\hat{G}) = E(G) \cup RE(G)$.

In the example presented on Figure 4 the edges from $RE(G)$ are the dashed
red edges.

• Similarly we define the graph sequence $\{\hat{G}_n\}$ by deleting all
loops in $G_n$ and adding extra edges to $G_n$. That is, the vertices are
$V(\hat{G}_n) = V(G_n) = \Sigma_n$, and with the definition of the simple
graph $G'_n$ in Section 2.5, the edge set is extended by the following
rule

(20)  $E(\hat{G}_n) = E(G'_n) \cup RE(G_n)$,

where

(21)  $RE(G_n) = \left\{ \binom{x_1 \dots x_n}{y_1 \dots y_n} : x_i = y_i,\ i \le n-1,\ \binom{x_n}{y_n} \in RE(G) \right\}$.

It is clear from Property R that 

(22) := min Co, > 0. 

Further, using (20) and (21) one can easily see that the degree of a
vertex $x \in \hat{G}_n$ is

(23)  $\widehat{\deg}_n(x) = S(x) \cdot \deg(x_n) + \left( \widehat{\deg}(x_n) - \deg(x_n) \right)$,

where $\deg(\cdot)$ denotes the degree of a vertex in $G$, while
$\widehat{\deg}(\cdot)$ stands for the degree in $\hat{G}$.

Remark 3.10. The difference between the degree of any node
$x \in \Sigma_n$ in $G_n$ and in $\hat{G}_n$ is bounded, thus the degree
sequence of $\hat{G}_n$ has the same power law exponent as $G_n$.

Theorem 3.11. There exist $K_1, K_2 > 0$ such that the local clustering
coefficient $C_x$ of an arbitrary node $x \in \hat{G}_n$ satisfies

    $\frac{K_1}{\widehat{\deg}_n(x)} \le C_x \le \frac{K_2}{\widehat{\deg}_n(x)}$.

Proof. We write $\mathcal{T}_n(x)$ for the set of all triangles in $\hat G_n$ containing the
node $x \in \Sigma_n$. We say that a triangle $(x,y,z)_\Delta \in \mathcal{T}_n(x)$ is regular
if and only if exactly two of its edges are from $E(G_n)$. The triangle
$(x,y,z)_\Delta \in \mathcal{T}_n(x)$ is called irregular if it is not regular. The set of
irregular triangles containing $x$ is denoted by $\mathcal{IRT}_n(x)$. We partition
the set of regular triangles $\mathcal{RT}_n(x)$ into the classes

$\mathcal{RT}_n(x) = \mathcal{RT}^1_n(x) \cup \mathcal{RT}^2_n(x)$

in the following way: a triangle $(x,y,z)_\Delta \in \mathcal{RT}_n(x)$ belongs to $\mathcal{RT}^1_n(x)$
if and only if $x$ is NOT an endpoint of the edge contained in $RE(G_n)$.
That is,

$\mathcal{RT}^1_n(x) := \left\{ (x,y,z)_\Delta \in \mathcal{RT}_n(x) : \binom{y}{z} \in RE(G_n) \right\}$.

Hence, $\mathcal{RT}^2_n(x)$ is the set of those $(x,y,z)_\Delta \in \mathcal{RT}_n(x)$ for which either
$\binom{x}{y} \in E(G_n)$ and $\binom{x}{z} \in RE(G_n)$ or vice versa. Summarizing these
partitions:

$\mathcal{T}_n(x) = \mathcal{RT}_n(x) \cup \mathcal{IRT}_n(x) = \mathcal{RT}^1_n(x) \cup \mathcal{RT}^2_n(x) \cup \mathcal{IRT}_n(x)$.

Now we define the cardinality of these classes:

$\Delta^1_n(x) := \#\mathcal{RT}^1_n(x)$, $\Delta^2_n(x) := \#\mathcal{RT}^2_n(x)$ and $\Delta^{ir}_n(x) := \#\mathcal{IRT}_n(x)$.

When $n = 1$ we suppress the index $n$. Observe that by Property R,

$\Delta^r_n(x) := \Delta^1_n(x) + \Delta^2_n(x) \ge 1$, $\forall n \ge 1$, $x \in \Sigma_n$.

Now we compute $\Delta^i_n(x)$, $i \in \{1, 2, ir\}$, for an arbitrary fixed $x \in \Sigma_n$.
To do so the notation $\ell(x)$ will be used. First we verify that

(24)  $\Delta^1_n(x) = \sum_{r=0}^{\ell(x)-1} \prod_{j=1}^{r} \deg(x_{n-j}) \cdot \Delta^1(x_n) = S(x) \cdot \Delta^1(x_n)$,

where $S(x)$ is as defined earlier. To see this, observe that it follows from
the earlier formulas that

$(x,y,z)_\Delta \in \mathcal{RT}^1_n(x)$

holds if and only if all of the following three assertions are satisfied:

(1) $\exists\, 0 \le r \le \ell(x) - 1$: $|y \wedge z| = n - 1$ and $|x \wedge y| = |x \wedge z| = n - r - 1$,

(2) $\binom{x_k}{y_k} \in E(G)$ whenever $n - r \le k \le n - 1$,

(3) $(x_n, y_n, z_n)_\Delta \in \mathcal{RT}^1(x_n)$.

Hence (24) is obtained by an immediate calculation.
Now we prove that

(25)  $\Delta^2_n(x) = \sum_{r=0}^{\ell(x)-1} \prod_{j=1}^{r} \deg(x_{n-j}) \cdot \Delta^2(x_n) = S(x) \cdot \Delta^2(x_n)$.

This is so because, by the earlier formulas and (21), we have that

$(x,y,z)_\Delta \in \mathcal{RT}^2_n(x)$

holds if and only if all of the following three assertions are satisfied:

(1) $\exists\, 0 \le r \le \ell(x) - 1$: $|x \wedge y| = n - 1$ and $|x \wedge z| = |y \wedge z| = n - r - 1$,

(2) $\binom{x_k}{z_k} \in E(G)$ whenever $n - r \le k \le n - 1$,

(3) $(x_n, y_n, z_n)_\Delta \in \mathcal{RT}^2(x_n)$.

Hence, using the same argument as above, we get (25).

Finally, we determine the number of irregular triangles containing $x$:

(26)  $\Delta^{ir}_n(x) = \Delta^{ir}(x_n)$.

This follows from the fact that

$(x,y,z)_\Delta \in \mathcal{IRT}_n(x)$

is equivalent to

$\forall\, 1 \le i \le n - 1:\ x_i = y_i = z_i$ and $(x_n, y_n, z_n)_\Delta \in \mathcal{IRT}(x_n)$.

We write $Z_\Delta(x)$ for the number of all triangles in $\hat G_n$ containing $x$:

$Z_\Delta(x) := \Delta^1_n(x) + \Delta^2_n(x) + \Delta^{ir}_n(x)$.



Using (23), (24), (25) and (26) we get

(27)  $C_x = \dfrac{Z_\Delta(x)}{\binom{\widehat{\deg}_n(x)}{2}} = \dfrac{2\Delta^r(x_n) \cdot S(x) + 2\Delta^{ir}(x_n)}{\widehat{\deg}_n(x)\,(\widehat{\deg}_n(x) - 1)}$,

where $S(x)$ is as defined earlier. Now we estimate $C_x$.

Claim 3.12.

(i): If $\ell(x) = 1$, then $C_x = C_{x_n}$.

(ii): If $\ell(x) \ge 2$, then we have

(28)  $\left| C_x - \dfrac{2\Delta^r(x_n)}{\deg(x_n)\,\widehat{\deg}_n(x)} \right| < \dfrac{\mathrm{const}}{\widehat{\deg}_n(x)^2}$.



Proof of the Claim. Part (i) follows immediately from the formulas above. To prove
(ii) we fix an arbitrary $x \in \Sigma_n$ with $\ell(x) \ge 2$. Since $t, u, v$ introduced
below depend only on $x_n$, there exists a constant $C^*$ independent of $n$
and $x$ such that

(29)  $u := \widehat{\deg}(x_n) - \deg(x_n) < C^*$,  $t := \dfrac{\Delta^r(x_n)}{\deg(x_n)} < C^*$,  $v := 2\Delta^{ir}(x_n) < C^*$.

To prove (28) it is enough to verify that

$Q := \widehat{\deg}_n(x)\,(\widehat{\deg}_n(x) - 1) \cdot C_x - 2t \cdot (\widehat{\deg}_n(x) - 1)$

is bounded in $n$ and $x \in \Sigma_n$. This is so because, by (23) and (27), writing $S = S(x)$, we have

$Q = 2\Delta^r(x_n) \cdot S + v - 2t\,(S \cdot \deg(x_n) + u - 1)$
$\ \ = 2\Delta^r(x_n) \cdot S + v - 2\Delta^r(x_n) \cdot S - 2t(u - 1)$
$\ \ = v - 2t(u - 1)$,

where we used $2tS\deg(x_n) = 2\Delta^r(x_n) \cdot S$. The last expression is bounded by (29).



□ 
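For completeness, the step from the boundedness of $Q$ to (28) can be spelled out as follows. This is only a sketch: it assumes $\widehat{\deg}_n(x) \ge 2$, takes $t = \Delta^r(x_n)/\deg(x_n)$ as in (29), and writes $C'$ (a name introduced here, not in the original) for a bound on $|Q|$.

```latex
% Dividing the definition of Q by \widehat{\deg}_n(x)(\widehat{\deg}_n(x)-1):
C_x - \frac{2t}{\widehat{\deg}_n(x)}
  = \frac{Q}{\widehat{\deg}_n(x)\,\bigl(\widehat{\deg}_n(x)-1\bigr)},
\qquad
\frac{2t}{\widehat{\deg}_n(x)}
  = \frac{2\Delta^r(x_n)}{\deg(x_n)\,\widehat{\deg}_n(x)} .
% If |Q| \le C' and \widehat{\deg}_n(x) \ge 2, then
% \widehat{\deg}_n(x) - 1 \ge \widehat{\deg}_n(x)/2, hence
\left| C_x - \frac{2\Delta^r(x_n)}{\deg(x_n)\,\widehat{\deg}_n(x)} \right|
  \le \frac{2C'}{\widehat{\deg}_n(x)^{2}} .
```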
Property R implies that both $C_{x_n}$ and $\Delta^r(x_n)/\deg(x_n)$ are bounded away from
zero. This completes the proof of Theorem 3.11. □
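The quantity controlled by Theorem 3.11 is easy to compute for small graphs. The following is a minimal sketch, not the authors' code: it assumes a simple graph stored as a dictionary of neighbour sets, and the names `local_clustering`, `average_clustering` and the toy graph are hypothetical.

```python
from itertools import combinations


def local_clustering(adj, x):
    """Local clustering coefficient of x: (#triangles through x) / C(deg(x), 2).

    `adj` maps each vertex to the set of its neighbours (simple graph, no loops).
    Vertices of degree < 2 are assigned coefficient 0 by convention (an assumption).
    """
    nbrs = adj[x]
    d = len(nbrs)
    if d < 2:
        return 0.0
    # A triangle through x is a pair of neighbours of x that are themselves adjacent.
    triangles = sum(1 for y, z in combinations(nbrs, 2) if z in adj[y])
    return triangles / (d * (d - 1) / 2)


def average_clustering(adj):
    """Average of the local clustering coefficients over all vertices."""
    return sum(local_clustering(adj, x) for x in adj) / len(adj)


# Toy graph (hypothetical): a triangle 0-1-2 with a pendant vertex 3 attached to 0.
toy = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
for v in toy:
    # Theorem 3.11 concerns the product C_x * deg(x) staying between two constants.
    print(v, local_clustering(toy, v), local_clustering(toy, v) * len(toy[v]))
```

On a graph built according to the construction of $\hat G_n$, printing `local_clustering(adj, x) * len(adj[x])` for all vertices is one way to inspect the two-sided bound numerically.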



The following theorem shows that the graph sequence $\hat G_n$ displays
features similar to those of the model considered in [2]; namely, the average local
clustering coefficient of the graphs $\hat G_n$ does not tend to zero with the
size of $\hat G_n$.

Theorem 3.13. The average local clustering coefficient $\bar C(\hat G_n)$ of the
graph $\hat G_n$ is bounded by two positive constants; more precisely,

(30)  $\dfrac{2 n_1 n_2}{N^2}\, C_{\min} \le \bar C(\hat G_n) \le \bar C(\hat G)$,

where $C_{\min}$ was defined in (22).
Proof. We will use the notation introduced in the proof of Theorem
3.11. It easily follows from the proof of Theorem 3.11 that

(31)  $C_x \le C_{x_n}$.

Namely, if $\ell(x) = 1$ then by Claim 3.12 (i), $C_x = C_{x_n}$. If $\ell(x) \ge 2$ then $S(x) > 1$,
thus using (27) we obtain $C_x \le C_{x_n}$ in this case as well.
This completes the proof of (31), from which the upper estimate of (30)
follows by averaging. On the other hand, to see that the lower estimate
holds, we take into consideration only the contribution of those $x \in \Sigma_n$ with
$\ell(x) = 1$:

$\bar C(\hat G_n) \ge \dfrac{1}{N^n} \sum_{x \in \Sigma_n,\ \ell(x) = 1} C_{x_n}$.

Using $C_z \ge C_{\min}$, the lower bound of (30) follows.



□ 



4. Definition of the randomized model 



In this section we randomize the deterministic model of Section 2 by
using $\Lambda$ in $[0,1]^2$. The random graph sequence $G^r_n$ is generated in a
way which was inspired by the W-random graphs introduced by Lovász
and Szegedy [10]. See also [5].

Fix a deterministic model with a base graph $G$, $|V(G)| = N$. This determines
$\Lambda(a, b)$, the limit of the sequence of scaled adjacency matrices;
see the definitions in Section 2.2. Now for each $n$, we throw
$M_n + 1$ independent, uniform random numbers over $[0, 1]$:

$X^{(1)}, X^{(2)}, \dots, X^{(M_n+1)} \sim U[0,1]$, i.i.d.

We denote the $N$-adic expansion of each of these numbers by

$X^{(i)} = (X^i_1, X^i_2, \dots)$, i.e. $X^{(i)} = \sum_{k=1}^{\infty} \dfrac{X^i_k}{N^k}$,

where the $X^i_k$-s are uniform over the set $\{0, 1, \dots, N - 1\}$. The $n$-th
approximation of $X^{(i)}$ is

$X^{(i)}_{[n]} = \sum_{k=1}^{n} \dfrac{X^i_k}{N^k}$.
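The digit extraction above can be sketched in a few lines. This is an illustration only; the function names are hypothetical, and floating-point input is assumed to be exactly representable (otherwise one would sample the digits directly, as in `random_code`).

```python
import random


def nadic_digits(x, N, n):
    """Return the first n digits of the N-adic expansion of x in [0, 1)."""
    digits = []
    for _ in range(n):
        x *= N
        d = int(x)      # next digit, an element of {0, ..., N-1}
        digits.append(d)
        x -= d          # keep the fractional part
    return digits


def nth_approximation(digits, N):
    """The n-th approximation: sum of digit_k / N**k over the given digits."""
    return sum(d / N ** (k + 1) for k, d in enumerate(digits))


def random_code(N, n, rng=random):
    """Sample the first n digits directly: i.i.d. uniform on {0, ..., N-1}.

    This has the same law as the first n N-adic digits of a U[0,1] variable.
    """
    return [rng.randrange(N) for _ in range(n)]


print(nadic_digits(0.625, 2, 3))          # binary digits of 5/8
print(nth_approximation([1, 0, 1], 2))    # back to 0.625
```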

Now we construct the random graph $G^r_n$ as follows: $V(G^r_n) = \{1, \dots, M_n\}$,
and $E(G^r_n)$ is given by

$E(G^r_n) = \left\{ (i,j) \ \middle|\ \mathrm{int}\big( I_{X^{(i)}_{[n]}} \times I_{X^{(j)}_{[n]}} \big) \cap \Lambda \ne \emptyset \right\}$,

where int denotes the interior of a set. Clearly,

$E(G^r_n) = \{ (i,j) \mid \Lambda_n(X^{(i)}, X^{(j)}) = 1 \}$.

Note that we can think of the first $n$ digits $(X^i_1, \dots, X^i_n)$ and $(X^j_1, \dots, X^j_n)$
of the $N$-adic expansions of $X^{(i)}$ and $X^{(j)}$ as vertices in $G_n$. We draw an
edge between the two vertices $i$ and $j$ in $G^r_n$ if the vertices $(X^i_1 \dots X^i_n)$
and $(X^j_1 \dots X^j_n)$ are connected by an edge in the deterministic model
$G_n$. This gives the following probabilistic interpretation of the random
model:

Remark 4.1. Consider the deterministic graph sequence $G_n$ with urns
sitting at each vertex $v \in G_n$. Now throw $M_n + 1$ balls independently
and uniformly into the urns, and connect vertex $i$ to vertex $j$ by an
edge in the random graph $G^r_n$ if and only if the urns containing balls $i$ and $j$ are
connected by an edge in $G_n$.
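The urn description in Remark 4.1 translates directly into a sampling procedure. The sketch below is a hedged illustration, not the authors' implementation: the deterministic graph is passed in as an arbitrary dictionary of neighbour sets (`det_adj`, a hypothetical name), so the construction of $G_n$ itself is not reproduced here. If `det_adj` contains loops, two balls in the same urn are connected.

```python
import random


def random_graph_from_urns(det_adj, num_balls, rng=random):
    """Urn model: throw balls uniformly into the vertices (urns) of a fixed graph.

    det_adj   -- dict mapping each urn label to the set of adjacent urn labels
    num_balls -- number of balls thrown (M_n + 1 in the text)
    Returns (ball_urn, edges): ball_urn[i] is the urn of ball i, and edges is the
    set of pairs (i, j), i < j, whose urns are adjacent in det_adj.
    """
    urns = list(det_adj)
    ball_urn = [rng.choice(urns) for _ in range(num_balls)]
    edges = set()
    for i in range(num_balls):
        for j in range(i + 1, num_balls):
            if ball_urn[j] in det_adj[ball_urn[i]]:
                edges.add((i, j))
    return ball_urn, edges


# Toy deterministic graph (hypothetical): a path on three urns.
path = {0: {1}, 1: {0, 2}, 2: {1}}
balls, rand_edges = random_graph_from_urns(path, 10, random.Random(0))
print(balls)
print(sorted(rand_edges))
```

The double loop is quadratic in the number of balls; for large $M_n$ one would instead group balls by urn and connect whole groups at once.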

We need to introduce some further notation. 

Frequently used definitions. Under assumption A1, for an $x \in G_n$
with $\ell(x) = k$ the degree of $x$ is

$t_k := \dfrac{d_1^{k+1} - 1}{d_1 - 1} + 1$,

independently of $n$.

In the random graph $G^r_n$, the conditional distribution of the degree
of a random node $V \in \{0, \dots, M_n\}$, conditioned on the
first $n$ digits of the $N$-adic expansion of the corresponding code $X^{(V)}$,
is Binomial:

(32)  $\big( \deg(V) \mid (X^V_1 \dots X^V_n) = x \big) \sim BIN\!\left( M_n,\ \dfrac{t_{\ell(x)}}{N^n} \right)$.

This follows from the characterization of $G^r_n$ described in Remark 4.1.
Namely, assume that the $V$-th ball has landed in the urn with label $x \in \Sigma_n$.
In $G_n$ there are exactly $\deg_n(x) - 1 = t_{\ell(x)}$ vertices $y \in \Sigma_n$ that are
connected to $x$. All the balls landing in urns corresponding to these
vertices $y$ will be connected to $V$ in $G^r_n$.

5. Properties of the randomized model 

In this section we determine the proportion of isolated vertices and 
characterize the degree sequence. 

5.1. Isolated vertices. 

Theorem 5.1. If $M_n = c_n N^n$ with $\lim_{n \to \infty} c_n = \infty$, then the fraction
of isolated vertices tends to zero as $n \to \infty$. More precisely, for a
uniformly chosen node $V \in G^r_n$,

$P(\deg(V) = 0) \le e^{-d_{\min} c_n}$,

where $d_{\min}$ stands for the minimal degree in the base graph $G$, and in
$\deg(\cdot)$ we do not count the loops.

The following corollary is an immediate consequence of the Borel-
Cantelli lemma.

Corollary 5.2. If $\sum_{n=1}^{\infty} c_n N^n e^{-d_{\min} c_n} < \infty$, then almost surely there will
be only finitely many $n$-s for which the graph $G^r_n$ has isolated vertices.
The assumption of the Corollary is satisfied if e.g. $c_n > n \log(N + 1)$.
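The summability claim can be checked by a short computation. The sketch below treats the boundary case $c_n = n\log(N+1)$ and assumes $d_{\min} \ge 1$ (the base graph has no isolated vertex) and $N \ge 2$; both assumptions are made here for illustration and are not stated at this point of the text.

```latex
% With c_n = n \log(N+1) and d_{\min} \ge 1:
c_n N^n e^{-d_{\min} c_n}
  = n \log(N+1)\, N^n (N+1)^{-d_{\min} n}
  \le n \log(N+1) \left( \frac{N}{N+1} \right)^{n},
% and \sum_{n \ge 1} n\, r^n < \infty for r = N/(N+1) < 1 by the ratio test.
% For larger c_n the terms only decrease: the map c \mapsto c\, e^{-d_{\min} c}
% is decreasing on [1/d_{\min}, \infty), and
% n \log(N+1) \ge \log 3 > 1 \ge 1/d_{\min} when N \ge 2.
```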
Proof of Theorem 5.1. Given the $N$-adic expansion of $X^{(V)}$, the probability
that a vertex is isolated depends on how many neighbors the
vertex $(X^V_1 \dots X^V_n)$ has in the deterministic model. So we can write

$P(\deg(V) = 0) = \sum_{x \in \Sigma_n} P\big(\deg(V) = 0 \mid (X^V_1 \dots X^V_n) = x\big) \cdot \dfrac{1}{N^n}$.

As we have already seen, $\big(\deg(V) \mid (X^V_1 \dots X^V_n) = x\big)$ follows a Binomial
distribution with parameters $M_n$ and $\deg_n(x)/N^n$, so the conditional
probability of isolation is

$P\big(\deg(V) = 0 \mid (X^V_1 \dots X^V_n) = x\big) = \left( 1 - \dfrac{\deg_n(x)}{N^n} \right)^{M_n} \le e^{-\deg_n(x)\, c_n} (1 + o(1))$.

Obviously $e^{-\deg_n(x)\, c_n} \le e^{-d_{\min} c_n}$ holds for all $x \in \Sigma_n$, which completes
the proof. □
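The inequality used in the proof is an instance of $1 - p \le e^{-p}$, and it can be checked numerically. The following sketch uses hypothetical parameter values ($N = 3$, $n = 4$, $M = 162$, so that $c = M/N^n = 2$); the function names are introduced here for illustration.

```python
import math


def isolation_probability(deg, N, n, M):
    """Exact conditional isolation probability (1 - deg/N^n)^M."""
    return (1.0 - deg / N ** n) ** M


def isolation_bound(deg, N, n, M):
    """Upper bound exp(-deg * c), where c = M / N^n."""
    c = M / N ** n
    return math.exp(-deg * c)


N, n, M = 3, 4, 162  # hypothetical values with c = 2
for deg in (1, 2, 5):
    print(deg, isolation_probability(deg, N, n, M), isolation_bound(deg, N, n, M))
```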

5.2. Decay of degree distribution. Fix a constant $K$ such that for
a standard normal variable $Z$, $P(|Z| > K) < e^{-10}$. We write

$I_{k,n} := \left[ c_n t_k - K\sqrt{c_n t_k},\ c_n t_k + K\sqrt{c_n t_k} \right]$,

and 

fco(n) := max i (n + 1) 



I logdi \ogdi 

Now we describe the degree distribution for the random model. 

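Theorem 5.3. Let $k \ge k_0(n)$ and $u \in I_{k,n}$. Then for a uniformly
chosen node $V$ in $G^r_n$,

$P(\deg(V) = u) = \left( \dfrac{n_1}{N} \right)^{k} \dfrac{n_2}{N} \cdot \dfrac{1}{\sqrt{c_n t_k}}\, \varphi\!\left( \dfrac{u - c_n t_k}{\sqrt{c_n t_k}} \right) \left( 1 + O\!\left( \dfrac{1}{\sqrt{c_n t_k}} \right) \right)$,

where $\varphi$ denotes the density function of a standard Gaussian variable.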
This immediately implies 

Corollary 5.4. The degree distribution of the random model is given
by the following formula for $a, b \in [-K, K]$:

$P\big( \deg(V) \in [c_n t_k + a\sqrt{c_n t_k},\ c_n t_k + b\sqrt{c_n t_k}] \big) = \left( \dfrac{n_1}{N} \right)^{k} \dfrac{n_2}{N} \cdot (\Phi(b) - \Phi(a)) + O\!\left( \left( \dfrac{n_1}{N} \right)^{k} \dfrac{1}{\sqrt{c_n t_k}} \right)$,

where $k \ge k_0(n)$ and $\Phi$ denotes the distribution function of a standard
Gaussian variable. So, for $u \in I_{k,n}$, $k \ge k_0(n)$, the tail of the probability
distribution is:

(33)  $P(\deg(V) > u) = \left( \dfrac{n_1}{N} \right)^{k+1} + \left( \dfrac{n_1}{N} \right)^{k} \dfrac{n_2}{N} \left( 1 - \Phi\!\left( \dfrac{u - c_n t_k}{\sqrt{c_n t_k}} \right) \right) + \left( \dfrac{n_1}{N} \right)^{k+1} O\!\left( \dfrac{1}{\sqrt{c_n t_k}} \right)$.

This holds because $P(\deg(V) > u)$ equals the sum of all probability
mass that is concentrated around the $t_l$-s for $l \ge k + 1$, resulting in the first
term, plus the second term coming from the part greater than $u$ of the
binomial mass around $t_k$. As a consequence, the decay of the degree
distribution follows a power law. Namely, the following holds.

Theorem 5.5. Let

$\gamma := 1 + \dfrac{\log(N/n_1)}{\log d_1}$.

Then the decay of the degree distribution is

$P(\deg(V) > u) = u^{-\gamma + 1} \cdot L(u)$,

where $L(u)$ is bounded between two positive constants.



The idea of the proof of Theorem 5.3. The conditional distribution of
the degree of a node $V$, conditioned on the $n$-digit $N$-adic expansion
$X^{(V)}_{[n]} = x$, follows a $BIN(c_n N^n,\ t_{\ell(x)}/N^n)$ law. This is close to a $POI(c_n t_{\ell(x)})$
random variable, because $c_n$ and $t_{\ell(x)}$ tend to infinity at a much slower
rate than $N^n$. Now for the $POI(c_n t_{\ell(x)})$ variable, the Central Limit
Theorem holds with an error term of order $1/\sqrt{c_n t_{\ell(x)}}$. Now the
unconditional degree distribution comes from the law of total probability
and from the fact that all other errors are negligible. □
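The two approximations invoked in this outline (Binomial by Poisson, then Poisson by Gaussian) can be compared numerically. The sketch below uses only the standard library; the parameter values $N = 3$, $n = 12$, $c_n = 5$, $t = 10$ are hypothetical and chosen only so that $c_n t$ is much smaller than $N^n$.

```python
import math


def binom_pmf(m, p, k):
    """P(BIN(m, p) = k), computed via log-gamma to avoid overflow."""
    log_pmf = (math.lgamma(m + 1) - math.lgamma(k + 1) - math.lgamma(m - k + 1)
               + k * math.log(p) + (m - k) * math.log1p(-p))
    return math.exp(log_pmf)


def poisson_pmf(lam, k):
    """P(POI(lam) = k)."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def gaussian_density(mean, u):
    """Density at u of a normal law with the given mean and variance equal to the mean."""
    return math.exp(-(u - mean) ** 2 / (2 * mean)) / math.sqrt(2 * math.pi * mean)


N, n, c, t = 3, 12, 5, 10       # hypothetical values
m = c * N ** n                  # number of trials, c_n * N^n
p = t / N ** n                  # success probability, t / N^n
mean = c * t                    # common mean c_n * t
for u in (40, 50, 60):
    print(u, binom_pmf(m, p, u), poisson_pmf(mean, u), gaussian_density(mean, u))
```

The Binomial and Poisson columns agree to several digits, while the Gaussian column differs at a coarser scale, in line with the $1/\sqrt{c_n t}$ error term mentioned above.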



Proof of Theorem 5.3. We determined the degree distribution of the
deterministic model under assumption (A1), see Section 3.1 for details.
Recall that if $k \ge k_0(n)$, then there is a mass at $t_k$; we write $p_k$ for this mass.
We show that in the random model $G^r_n$, these Dirac masses are turned
into Gaussian masses centered at $c_n t_k$. Suppose $u \in I_{k,n}$. By the law
of total probability, we have

(34)  $P(\deg(V) = u) = P\big(\deg(V) = u \mid (X^V_1 \dots X^V_n) = x,\ \ell(x) = k\big) \cdot p_k + S_1 + S_2$,

where

$S_1 = \sum_{j=1}^{k-1} P\big(\deg(V) = u \mid (X^V_1 \dots X^V_n) = x,\ \ell(x) = j\big) \cdot p_j$,

$S_2 = \sum_{j=k+1}^{n} P\big(\deg(V) = u \mid (X^V_1 \dots X^V_n) = x,\ \ell(x) = j\big) \cdot p_j$.

$S_1$ and $S_2$ combine the total contribution of the cases when $\ell(X^V_1 \dots X^V_n) \ne k$,
i.e. referring to the urn model of our random graph, $S_1 + S_2$ covers
the cases when the random ball $V$ falls into an urn whose degree in $G_n$ is
different from $t_k$. As a first step in our proof we show that the
right hand side in the first line of (34) gives the formula in Theorem
5.3, then as a second step we verify that $S_1 + S_2$ is negligible.



First step: Following the standard proof of the local form of de 
Moivre-Laplace CLT, we obtain that for 



>(deg(V) = u\(XY...XV) 



+ ( 1 **(g) 




We can neglect the factor $1 - t_k/N^n$. This completes the first step.

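Second step: Since $u \in I_{k,n}$, we have:

(35)
$S_1 \le \sum_{j=1}^{k-1} P\big(\deg(V) \ge c_n t_k - K\sqrt{c_n t_k} \mid (X^V_1 \dots X^V_n) = x,\ \ell(x) = j\big) \cdot p_j$,

$S_2 \le \sum_{j=k+1}^{n} P\big(\deg(V) \le c_n t_k + K\sqrt{c_n t_k} \mid (X^V_1 \dots X^V_n) = x,\ \ell(x) = j\big) \cdot p_j$.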

Now we use the following fact, known from Chernoff bounds: for a $Z \sim BIN(m, p)$
variable,

> (1 + 6)E(Z)) <e~^ z \ 



and the same bound holds for $P(Z \le (1 - \delta)E(Z))$. By the conditional
binomial law of the degree, to estimate each summand in (35) we can apply these inequalities for
$Z_j \sim BIN(c_n N^n,\ t_j/N^n)$, $j \in \{1, \dots, n\} \setminus \{k\}$, yielding an upper bound

fc— 1 n 
j=l j=k+l 



Since e~^ d i Cn = o( J-,- 1 ), the statement of Theorem 



5.3 



follows. □ 



Now we are ready to prove the main result of the section. 



Proof of Theorem 5.5. If $u \in I_{k,n}$, then

1 

d 



u = df ■ 1 + O 



Using (33) we obtain that there exists $C(u) \in [\frac{n_1}{N}, 1]$ such that

$P(\deg(V) > u) = \left( \dfrac{n_1}{N} \right)^{k} C(u)$.

The last two formulas immediately imply the assertion of the Theorem
whenever $u \in I_{k,n}$. Actually, in this case we have $\frac{n_1}{N} \le L(u) \le 1$. If
$u \notin \bigcup_k I_{k,n}$, then there exists $k = k(u)$ such that $u \in (c_n t_k,\ c_n t_{k+1})$. By
monotonicity of the distribution function we have

$P(\deg(V) > c_n t_{k+1}) \le P(\deg(V) > u) \le P(\deg(V) > c_n t_k)$.

Applying the theorem for $c_n t_{k+1}$ and $c_n t_k$, we lose a constant factor in
the upper bound of $L(u)$, and the assertion of the Theorem follows. □